Selective Amnesia: On Efficient, High-Fidelity and Blind Suppression of Backdoor Effects in Trojaned Machine Learning Models
In this paper, we present a simple yet surprisingly effective technique to
induce "selective amnesia" on a backdoored model. Our approach, called SEAM,
has been inspired by the problem of catastrophic forgetting (CF), a long
standing issue in continual learning. Our idea is to retrain a given DNN model
on randomly labeled clean data to induce CF in the model, causing it to
abruptly forget both the primary and backdoor tasks; we then recover the primary
task by retraining the randomized model on correctly labeled clean data. We
analyzed SEAM by modeling the unlearning process as continual learning and
further approximating a DNN using Neural Tangent Kernel for measuring CF. Our
analysis shows that our random-labeling approach actually maximizes the CF on
an unknown backdoor in the absence of triggered inputs, and also preserves some
feature extraction in the network to enable a fast revival of the primary task.
We further evaluated SEAM on both image processing and Natural Language
Processing tasks, under both data contamination and training manipulation
attacks, over thousands of models either trained on popular image datasets or
provided by the TrojAI competition. Our experiments show that SEAM vastly
outperforms the state-of-the-art unlearning techniques, achieving a high
Fidelity (measuring the gap between the accuracy of the primary task and that
of the backdoor) within a few minutes (about 30 times faster than training a
model from scratch using the MNIST dataset), with only a small amount of clean
data (0.1% of training data for TrojAI models).
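The two-phase procedure described in this abstract (forget on randomly labeled clean data, then recover on correctly labeled clean data) can be illustrated with a minimal sketch. This is not the authors' implementation: the `train_epoch` and `accuracy` callables, the stopping thresholds, and the `max_epochs` cap are all hypothetical placeholders supplied by the caller, and `fidelity_gap` is only the plain accuracy gap that the abstract mentions, without any normalization the paper may apply.

```python
import random
from typing import Any, Callable, List, Sequence, Tuple

Example = Tuple[Sequence[float], int]  # (features, label)


def randomize_labels(data: List[Example], num_classes: int, seed: int = 0) -> List[Example]:
    """Return a copy of `data` with every label replaced by a uniformly random class."""
    rng = random.Random(seed)
    return [(x, rng.randrange(num_classes)) for x, _ in data]


def seam_sketch(
    model: Any,
    clean_data: List[Example],
    num_classes: int,
    train_epoch: Callable[[Any, List[Example]], None],  # hypothetical: one pass of training
    accuracy: Callable[[Any, List[Example]], float],    # hypothetical: accuracy in [0, 1]
    forget_threshold: float,   # assumption: stop forgetting once clean accuracy drops this low
    recover_threshold: float,  # assumption: stop recovery once clean accuracy is this high
    max_epochs: int = 100,
) -> Any:
    """Forget on random labels, then recover on the true labels."""
    noisy_data = randomize_labels(clean_data, num_classes)

    # Phase 1 (forgetting): train on randomly labeled clean data until the
    # primary-task accuracy collapses.
    for _ in range(max_epochs):
        if accuracy(model, clean_data) <= forget_threshold:
            break
        train_epoch(model, noisy_data)

    # Phase 2 (recovery): retrain on correctly labeled clean data until the
    # primary-task accuracy is restored.
    for _ in range(max_epochs):
        if accuracy(model, clean_data) >= recover_threshold:
            break
        train_epoch(model, clean_data)

    return model


def fidelity_gap(primary_accuracy: float, backdoor_success_rate: float) -> float:
    """Gap between primary-task accuracy and backdoor success rate (unnormalized)."""
    return primary_accuracy - backdoor_success_rate
```

In practice `train_epoch` and `accuracy` would wrap a real training loop in a deep learning framework; the sketch leaves them abstract so that the control flow of the two phases stays visible.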
Large Language Model Soft Ideologization via AI-Self-Consciousness
Large language models (LLMs) have demonstrated human-level performance on a
vast spectrum of natural language tasks. However, few studies have addressed
the LLM threat and vulnerability from an ideology perspective, especially when
they are increasingly being deployed in sensitive domains, e.g., elections and
education. In this study, we explore the implications of GPT soft
ideologization through the use of AI-self-consciousness. By utilizing GPT
self-conversations, AI can be granted a vision to "comprehend" the intended
ideology, and subsequently generate finetuning data for LLM ideology injection.
When compared to traditional government ideology manipulation techniques, such
as information censorship, LLM ideologization proves advantageous; it is easy
to implement, cost-effective, and powerful, and is therefore fraught with risk.
Identifying Repeat Domains in Large Genomes
We present a graph-based method for the analysis of repeat families in a repeat library. We build a repeat domain graph that decomposes a repeat library into repeat domains, short subsequences shared by multiple repeat families, and reveals the mosaic structure of repeat families. Our method recovers documented mosaic repeat structures and suggests additional putative ones. Our method is useful for elucidating the evolutionary history of repeats and annotating de novo generated repeat libraries.
De novo identification of LTR retrotransposons in eukaryotic genomes
BACKGROUND: LTR retrotransposons are a class of mobile genetic elements containing two similar long terminal repeats (LTRs). Currently, LTR retrotransposons are annotated in eukaryotic genomes mainly through conventional homology searching, which is limited to annotating known elements. RESULTS: In this paper, we report a de novo computational method that can identify new LTR retrotransposons without relying on a library of known elements. Specifically, our method identifies intact LTR retrotransposons by using an approximate string matching technique and protein domain analysis. In addition, it identifies partially deleted or solo LTRs using profile Hidden Markov Models (pHMMs). As a result, this method can de novo identify all types of LTR retrotransposons. We tested this method on two pairs of eukaryotic genomes, C. elegans vs. C. briggsae and D. melanogaster vs. D. pseudoobscura. LTR retrotransposons in C. elegans and D. melanogaster have been intensively studied using conventional annotation methods. Compared with previous work, we identified new intact LTR retroelements and new putative families, which suggests that new retroelements may remain to be discovered even in well-studied organisms. To assess the sensitivity and accuracy of our method, we compared our results with a previously published method, LTR_STRUC, which predominantly identifies full-length LTR retrotransposons. In summary, both methods identified a comparable number of intact LTR retroelements, but our method can identify nearly all known elements in C. elegans, while LTR_STRUC missed about 1/3 of them. Our method also identified more known LTR retroelements than LTR_STRUC in the D. melanogaster genome. We also identified some LTR retroelements in the other two genomes, C. briggsae and D. pseudoobscura, which have not been completely finished. In contrast, the conventional method failed to identify those elements.
Finally, the phylogenetic and chromosomal distributions of the identified elements are discussed. CONCLUSION: We report a novel method for de novo identification of LTR retrotransposons in eukaryotic genomes with favorable performance over the existing methods.
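The idea of locating two similar long terminal repeats by approximate string matching can be illustrated with a small brute-force sketch. It is a simplified stand-in, not the method reported in the abstract: it assumes a fixed repeat length, uses Hamming distance (substitutions only, no indels) as the similarity measure, and omits the protein domain and pHMM steps entirely. The function and parameter names are hypothetical.

```python
from typing import List, Tuple


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(ca != cb for ca, cb in zip(a, b))


def candidate_ltr_pairs(
    genome: str,
    ltr_len: int,         # assumption: both repeats have this fixed length
    min_gap: int,         # minimum number of bases between the two repeats
    max_gap: int,         # maximum number of bases between the two repeats
    max_mismatches: int,  # tolerated substitutions between the two repeats
) -> List[Tuple[int, int, int]]:
    """Return (left_start, right_start, mismatches) for nearly identical window pairs."""
    hits = []
    n = len(genome)
    for i in range(n - ltr_len + 1):
        left = genome[i:i + ltr_len]
        # The right window must start between min_gap and max_gap bases after
        # the end of the left window, and must fit inside the genome.
        first = i + ltr_len + min_gap
        last = min(i + ltr_len + max_gap, n - ltr_len)
        for j in range(first, last + 1):
            d = hamming(left, genome[j:j + ltr_len])
            if d <= max_mismatches:
                hits.append((i, j, d))
    return hits


if __name__ == "__main__":
    toy = "TTTT" + "ACGTACGA" + "GGGGGG" + "ACGTACGT" + "CCCC"
    for left, right, d in candidate_ltr_pairs(toy, 8, 4, 10, 1):
        print(left, right, d)
```

The scan is quadratic in the worst case, so it is only suitable for toy inputs; a genome-scale tool would rely on an index such as a suffix array or on seed-and-extend matching.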